Ser. No. 08/182,183 SYNE225-E

108. (Newly added) A method for the expression of glial cell line-derived
neurotrophic factor, comprising modifying a cell to express a nucleic acid sequence encoding
a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:6 or a polypeptide
which is at least 70% identical to the amino acid sequence set forth in SEQ ID NO:6 when
four gaps in a length of 100 amino acids may be introduced to assist in that alignment.

109. (Newly added) The method of claim 108, wherein the expressed polypeptide
is a monomer and wherein the method further comprises refolding expressed polypeptide to
form a disulfide-bonded dimer.

110. (Newly added) The method of claim 109, wherein the disulfide-bonded dimer
is glycosylated.

111. (Newly added) The method of claim 108, wherein the expressed polypeptide
is secreted by said cell.

112. (Newly added) The method of claim 111, wherein the secreted polypeptide is
a disulfide-bonded dimer.

113. (Newly added) A recombinant or isolated nucleic acid sequence, comprising:

(a) sequences comprising nucleotides -78 through -1 of SEQ ID NO:5, 59 through 208
of SEQ ID NO:8 or 59 through 289 of SEQ ID NO:25; or

(b) sequences encoding all or a portion of amino acids -26 to -1 of SEQ ID NO:5,
1 through 50 of SEQ ID NO:8 or -77 through -1 of SEQ ID NO:25;

wherein said sequences are used in the expression of glial cell line-derived neurotrophic
factor.

114. (Newly added) A recombinant or isolated nucleic acid sequence comprising a
sequence complementary to a nucleic acid sequence of claim 26.

115. (Newly added) A recombinant or isolated nucleic acid sequence according to
claim 26 further comprising an amino-terminal methionine residue.

116. (Newly added) A recombinant or isolated nucleic acid molecule which encodes
glial cell line-derived neurotrophic factor having a molecular weight of about 31-42 kD on
non-reducing SDS-PAGE, a molecular weight of about 20-23 kD on reducing SDS-PAGE, and
which promotes dopamine uptake in dopaminergic neurons at a concentration of
approximately 60 pg/ml.

Please cancel Claim 87 without prejudice. Applicants intend to pursue the antibody-related
claims separately in a continuation application.

REMARKS

The relevant claim provisions have been amended to specify that the nucleic acid
sequence may encode a protein having an amino acid sequence which is at least approximately
70% identical to the amino acid sequence set forth in SEQ ID NO:4 or SEQ ID NO:6 when four
gaps in a length of 100 amino acids may be introduced to assist in that alignment. The
amendment provides a clearly defined meaning for the previously used "homology"
terminology, and support for the amendment can be found in the specification, at page 20,
lines 12-22. The relevant claim provisions also have been amended to clarify the
conditions for hybridization of the nucleic acid sequences. Support for this amendment can
be found in the specification at page 63, lines 12-22 (Example 2). Thus, the claims have
been amended to describe the disclosed sequences in terms of specific SEQ ID NOs, as well as
structural and functional characterizations, as suggested by the Examiner.

The amendments should not be construed as an acquiescence to the rejections and have
been made solely to expedite the prosecution of this application. Applicants reserve the
right to pursue the claims as originally filed in another application(s). The amendments do
not add any new matter. The above-listed claim amendments were made to better clarify the
claims in response to the Examiner's objections, and thereby, make them suitable for
allowance or place them in better form for appeal. Therefore, it is respectfully requested
that the amendments be entered at this time.

Section 112. First Paragraph Rejections

The Examiner rejected claims 26, 31, 36, 42-55, 75-87 and 89-94 under §112,
first paragraph, stating that the disclosure is enabling only for those claim provisions
which are limited to the sequences referred to by specific SEQ ID NOs. In particular, the
Examiner stated that the specification does not provide one skilled in the art with a
description of what is meant by the terms "homologous" and "hybridize". The Examiner,
therefore, concluded that such nucleic acid sequences could not be obtained by those of
ordinary skill in the art without resorting to an undue amount of experimentation.
Applicants traverse this objection/rejection on the basis that the present disclosure, in
accordance with §112, first paragraph, provides all of the information that is necessary for
one of ordinary skill in the art to make and use these nucleic acid sequences without undue
experimentation.

First, as suggested by the Examiner, the intended meaning of the term "homology" 
has been clarified, using the specific terminology of the specification. The description of 
percent homology or percent identity is clearly set forth in the specification by the 
definition of the means of calculating the percentage of homology. Applicants direct the 
Examiner's attention to page 20, lines 12-22, which states: 

The percentage of homology as described herein is calculated as the percentage of 
amino acid residues found in the smaller of the two sequences which align with 
identical amino acid residues in the sequence being compared when four gaps in a 
length of 100 amino acids may be introduced to assist in that alignment as set 
forth by Dayhoff, in Atlas of Protein Sequence and Structure Vol. 5, p. 124 
(1975), National Biochemical Research Foundation, Washington, D.C., 
specifically incorporated herein by reference. 

As described above, the claims have been amended to clarify that "homology" is 
defined by the percent of amino acids which are identical between the sequences being 
compared. Applicants, therefore, provided a clear description in the specification of how 
homology is determined, and with the amendment the claims are fully interpretable. 

Second, the Examiner states that one skilled in the art would not understand how to 
make and use the nucleic acid sequences encoding homologous polypeptides. The Examiner 
states that one skilled in the art could not produce such a nucleic acid sequence without undue 
experimentation because there are no teachings in the specification or prior art teachings to 
enable one skilled in the art to rationally identify such sequences. In particular, the 
Examiner states that one skilled in the art could not produce a functional GDNF polypeptide 
in the form of a substitution variant without undue experimentation because there are no
prior art teachings to enable one skilled in the art to rationally design such a modified
polypeptide. This aspect of the rejection is also respectfully traversed. 

To fulfill the enablement requirement, an application must describe how to make and 
use the claimed invention. While experimentation needed to practice the invention must not 
be "undue experimentation", it is well known that enablement is not precluded by the 
necessity for some experimentation such as routine screening (see In re Wands, 8 USPQ2d
1400-1407 (CAFC, 1988)).

"the determination of what constitutes undue experimentation in a given
case requires the application of a standard of reasonableness, having due regard
for the nature of the invention and the state of the art ... the test is not merely
quantitative, since a considerable amount of experimentation is permissible, if it
is merely routine, or if the specification in question provides a reasonable
amount of guidance with respect to the direction in which the experimentation
should proceed." [emphasis added]

As demonstrated in the specification, following the initial discovery of a GDNF 
protein sequence, Applicants proceeded to identify rat and human nucleic acid sequences 
using the fully detailed procedures provided in Example 2. These hybridization procedures 
demonstrated that probes could readily be designed and used to identify various GDNF 
sequences across species. At the time of filing the present application it was also well known 
that naturally occurring sequences so identified could then be compared and that the 
conserved residues and regions in these sequences could readily be determined. Therefore, 
with the specific disclosure of the rat and human sequences, those skilled in the art were 
provided with the information needed to obtain naturally occurring variant nucleic acid 
sequences and identify conserved residues and regions in these sequences, just as was 
described in the specification. With this information, those skilled in the art were provided 
with the means of making and using nucleic acid sequences which encode the polypeptides as 
well as the blueprint for the rational design of non-naturally occurring variant proteins 
and their encoding nucleic acid sequences. 

The specification describes precisely the means for identifying and isolating nucleic 
acid sequences encoding GDNF polypeptides. Using probes based on the first discovered 
protein, the rat nucleic acid sequence and polypeptide were identified (Please see 
Example 2(a)). Using probes based on the rat nucleic acid sequence, hybridization 
procedures (using conditions of reduced stringency) led to the identification of the human
nucleic acid sequence (Please see Example 2(c)). Thus, the disclosure distinctly describes
the means of obtaining and identifying variant sequences using defined and exemplary probes 
and hybridization techniques. It is also well within the competence of those of ordinary skill 
in the art to identify nucleic acid sequences from a variety of species using the sequences 
described in the specification. As a result, not only is the identification of the claimed 
sequences well within the competence of those of ordinary skill in the art, the specification 
also "provides a reasonable amount of guidance with respect to the direction in which the 
experimentation should proceed". 

This guidance is also demonstrated with the further identification of nucleic acid 
sequences and amino acid sequences of other species such as chicken and Xenopus (South 
African clawed toad) attached hereto as Exhibit 1. Furthermore, it is well known that the 
polypeptides encoded by such sequences may readily be compared for percent identity, as 
described in the specification, to identify the amino acid residues and regions which are 
conserved between species, and conversely, those which vary. For example, the comparison
of the rat and human amino acid sequences demonstrates a homology which can be described as
an approximately 93% identical amino acid sequence.

With regard to the particular features, those skilled in the art readily appreciate
that residues which are not conserved in nature might be modified. Moreover, as found in
nature, such modifications are typically designed to be conservative amino acid substitutions
so that the function of the polypeptide is not changed. These modifications are well within the
knowledge and ability of the ordinarily skilled artisan. Evidence of the common availability
and practice of these techniques is provided in the specification as well as in commonly used 
references as discussed further below. In addition to the disclosure of the specifically 
identified nucleic acid and protein sequences in the present specification, those skilled in the 
art were provided with the means as well as the guidance to identify, design, make and use 
variant proteins and the encoding nucleic acid sequences without the need for undue 
experimentation. 

The ability to substitute one amino acid for another in a protein having a fully 
defined amino acid sequence is well established in the art. Once a starting sequence is 
known, it is well-appreciated that each of certain amino acid residues may be substituted 
with a different amino acid residue without affecting the structural integrity or activity of 
the molecule. For example, those of ordinary skill in the art appreciate that sequences 
having at least 70% identity and the preserved function of the native sequence involve 
modifications that are generally termed "conservative", e.g., the substitution of one amino 
acid for another whose size, shape, charge, hydrogen bonding capacity or chemical reactivity 
are very similar. Such conservative variations have little or no effect on the overall net
charge, polarity or hydrophobicity of the variant molecule, and therefore, using these
modifications there is a clear and reasonable expectation of successfully producing variant 
polypeptides having both the required homology and function. 

The ability to select such amino acid residues for substitution is amply demonstrated 
by such common textbook descriptions of amino acid characteristics as found in Stryer's 
Biochemistry, 3rd Edition (pp. 16-21, W.H. Freeman and Company, New York (1988)). 
As taught in Stryer, the amino acids may be grouped based on their attributes and the
replacement or substitution of one member in a group with another member of that group
will be conservative so as to have little or no effect on the overall net charge, polarity, or
hydrophobicity of the protein. These conservative substitutions may be summarized as
indicated below. 

Conservative amino acid substitutions

Basic:        arginine, lysine, histidine
Acidic:       glutamic acid, aspartic acid
Polar:        glutamine, asparagine
Hydrophobic:  leucine, isoleucine, valine
Aromatic:     phenylalanine, tryptophan, tyrosine
Small:        glycine, alanine, serine, threonine, methionine
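
By way of illustration only, the groupings in the table above can be written as a simple lookup. The following Python sketch is not part of the specification or of Stryer; the one-letter residue codes and the names `CONSERVATIVE_GROUPS`, `group_of` and `is_conservative` are assumptions introduced here for the example.

```python
# Illustrative sketch only. The groups mirror the table above; the names
# used here are hypothetical and are not taken from the specification.
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),        # arginine, lysine, histidine
    "acidic": set("ED"),        # glutamic acid, aspartic acid
    "polar": set("QN"),         # glutamine, asparagine
    "hydrophobic": set("LIV"),  # leucine, isoleucine, valine
    "aromatic": set("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": set("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def group_of(residue):
    """Return the table group for a one-letter residue code, or None."""
    code = residue.upper()
    for name, members in CONSERVATIVE_GROUPS.items():
        if code in members:
            return name
    return None


def is_conservative(original, replacement):
    """True when both residues fall in the same group of the table."""
    group = group_of(original)
    return group is not None and group == group_of(replacement)
```

Under these assumptions, `is_conservative("K", "R")` is true because lysine and arginine are both listed as basic, while `is_conservative("D", "L")` is false. Residues that the table does not list (for example cysteine) belong to no group in this sketch.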

Furthermore, the skilled worker has ready access to commonly used molecular 
sequence analysis scoring systems. Such analysis systems are based on datasets similar to
that presented in Gribskov and Devereux, Sequence Analysis Primer (Chapter 3, pp. 134-137
and 233, Stockton Press, New York (1991)) which describes the work of
Dayhoff et al. As depicted in the table on page 233, certain amino acid residues are known 
as being commonly substituted by nature. These naturally occurring substitutions generally 
correlate to the amino acid groupings discussed above in Stryer. For example: aspartic acid 
(D) may be replaced with glutamic acid (E) or asparagine (N); arginine (R) may be 
replaced with lysine (K); glutamine (Q) may be replaced with asparagine (N) or aspartic
acid (D); and serine (S) may be replaced with threonine (T) or glycine (G). This widely 
accepted means of scoring protein sequence alignments is used as the standard by which the 
newer systems, such as BLAST and FASTA, compare themselves. Because such modifications 
occur naturally, a sequence modification plan made on the basis of such substitutions would 
not be expected to change the binding attributes or activity of the variant molecule. Thus, 
such conservative substitutions are clearly available for use to design variants using the 
isolated sequences as the protein blueprint. 

In addition, recombinant technology and chemical synthesis procedures for the 
manufacture of such designed compounds are so well established as to be available in college 
textbooks and standard laboratory manuals. As cited in the application, one such common 
reference is Sambrook et al., Molecular Cloning, A Laboratory Manual (1989). This is a
standard reference used by those skilled in the art to make modified nucleic acid sequences 
for the expression of variant polypeptides. For example, it provides commonly practiced 
procedures such as oligonucleotide-mediated mutagenesis which may be used to produce the 
variants (pp. 15.51-15.94). 

When the present application was filed (i.e., 1991), all of these analytical 
procedures and modification and production techniques were commonly known, routinely 
used and highly reproducible. The fact that this information appears in common textbooks 
and standard laboratory manuals further demonstrates that such information is neither
unknown nor obscure. Thus, this information is part of the common knowledge which those of
ordinary skill in the art possess or know where to obtain. Furthermore, the standardization 
of these procedures and techniques exemplifies the fact that polypeptide and nucleic acid 
sequence variants can be designed and produced by known methods without undue 
experimentation. The novel GDNF nucleic acid and amino acid sequences, the means of
determining homology, and the availability of these standard techniques support the
conclusion that those skilled in the art were given sufficient guidance for the rational design 
of the claimed nucleic acid sequences once presented with the disclosure of the present 
specification. 






Applicants agree that prior to the present disclosure of the specific GDNF
polypeptides and nucleic acid sequences there was insufficient information to identify
naturally occurring GDNF variants or to make variants by modifying the unknown
sequences. With the discovery of those sequences and their disclosure in the specification,
however, the level of ability in the art with reference to GDNF proteins was changed. Those
of ordinary skill now had detailed descriptions of how to find, make and identify GDNF. In
addition, they were provided with clear and concise information on the nucleic acid and
amino acid sequences of GDNF. Such information, in combination with aspects of 
hybridization procedures and protein production that are well within the skill in the art, 
provides a broad scope of enablement. 

It should also be noted that this aspect of the amended claims does not encompass all 
sequences, but rather only those GDNF-encoding sequences having specific structural and 
functional characteristics. For example, the provision in amended claim 26 is directed to 
nucleic acid sequences encoding polypeptides having at least approximately 70% identity, as 
discussed above, and having the ability to promote dopamine uptake in dopaminergic 
neurons. The fact that the claimed sequences encode a polypeptide having the function of 
human GDNF means the polypeptides retain the activity of that native protein. It does not 
require any procedure outside those described in the specification to make and use the 
claimed sequences or to evaluate this activity. Thus, the sequences are defined in terms of 
specifically identified sequences and function. 

As demonstrated by the specification, not only was the step-by-step scheme of 
implementing the invention provided, it was successfully accomplished. Because there is 
sufficient guidance for one of ordinary skill to identify, rationally design, produce and test 
the presently claimed molecules following the teachings of the specification and the known 
art, it does not require undue experimentation to make and use the claimed subject matter, 
and therefore, the claims are sufficiently supported. The evidence comprising the teachings
of the specification and the known art demonstrates that a skilled artisan would have been 
able to practice the full scope of the claimed subject matter. Because the specification 
enables the claimed subject matter, Applicants respectfully request that this rejection be 
withdrawn. 

The Examiner also asserts that the rejection is applicable to the portions of claims 
using the term "hybridize". The Examiner states that the conditions for hybridization are 
not included in the specification. 

Applicants traverse this rejection and respectfully direct the Examiner's attention 
to page 34, line 30, through page 35, line 20. The described hybridization conditions and
techniques, as well as the means of determining appropriate hybridization conditions, were
well known to those of ordinary skill in the art at the time of filing the parent application.
Beltz et al. (Methods in Enzymology 100:266-285, 1983), which is of record in this case,
sets forth exemplary quantitative considerations as well as a common strategy for the
systematic analysis of family members. In addition, textbook descriptions of hybridization
techniques and conditions, such as Sambrook et al. (Molecular Cloning, A Laboratory
Manual, 2nd edition) which was discussed above, provide descriptions of well-known,
standard techniques. Further commonly used techniques for the selection of probes and 
hybridization procedures are described in Lathe, J. Mol. Biol., 183, 1-12 (1985). 

In addition to the disclosed general procedures, the specification clearly sets forth 
and fully describes several specific hybridization examples. In fact, both the rat and the 
human GDNF sequences identified in SEQ ID NOs 3 and 5 were identified using fully detailed 
hybridization techniques. Using a labeled degenerate sequence based on the novel protein 
purified in Example 1, a hybridization procedure was used to identify a nucleic acid
sequence encoding a rat GDNF molecule (please see page 57, line 19 through page 58, line 
26). Using a PCR-constructed probe made from the rat nucleic acid sequence (about 250 
base pairs), a hybridization procedure was performed, using conditions of reduced 
stringency. This procedure identified a nucleic acid sequence, from a human genomic 
library, encoding a human GDNF molecule (please see page 62, line 4 through page 63, line 
34). A similar procedure was used to identify human GDNF in a cDNA library constructed 
from A+ RNA extracted from the human putamen (please see page 64, line 7 through page 
66, line 34). The disclosure of such reduced stringency procedures is clearly set forth in 
the present specification as well as in the Beltz et al. and Sambrook et al. references which 
are well known to those skilled in the art. 

The claims, however, have been further amended to point out that conditions of 
reduced stringency are used to identify the sequences, as is described in Example 2. The 
sequences are further described as those which have the ability of promoting dopamine 
uptake in dopaminergic neurons. Whether or not a sequence has this combination of 
attributes is readily determined by the skilled practitioner, and therefore, the "metes and 
bounds" of the claims are readily understood. Moreover, the claimed subject matter is 
sufficiently enabled as the specification has provided specific examples and reasonable 
guidance to allow one of ordinary skill in the art to make and use the claimed subject matter. 
Because the Specification enables this aspect of the claimed subject matter, Applicants 
respectfully request that this rejection be withdrawn. 






In summary, each of the presently claimed sequences is fully disclosed in the
specification. The claimed sequences are delimited in relation to a specifically disclosed and
referenced SEQ ID NO, a required amino acid sequence identity, and/or a required
hybridization characteristic, as well as the requisite function of promoting dopamine uptake
in dopaminergic neurons. Each of these requirements is clearly set forth in the
specification with detailed descriptions of how to obtain the claimed subject matter. Thus,
the claimed sequences are fully described in terms sufficient not only to envision but to
clearly identify the sequences for those skilled in the art and to distinguish them from other
materials. Furthermore, the claimed subject matter is enabled as the specification has
clearly set forth specific examples and reasonable guidance to allow one of ordinary skill in 
the art to make and use the invention as presently claimed. Therefore, the specification 
fully enables the practice of the presently claimed subject matter, and the rejection for lack 
of enablement may properly be withdrawn. 

Lastly, the Examiner rejected certain claims and claim provisions concerning the 
use of anti-GDNF antibody. These claims have been canceled, without prejudice, to be 
pursued in a copending application. Thus, the rejections may properly be withdrawn. 

Section 112. Second Paragraph Rejections 

Claims 26, 31, 42-55, 75-87 and 89-94 were rejected under §112, second 
paragraph. The Examiner restated that the use of the phrases "at least 70% homologous" 
and "at least 90% homologous" are indefinite as having no precise or accepted definition of 
the term "homologous". 

This rejection is respectfully traversed. As discussed above, the specification
clearly and precisely provides the definition for determining "homology". Because the
present claims particularly point out and distinctly claim the subject matter of the present 
invention, it is respectfully requested that these rejections properly be withdrawn. 

With respect to how the calculation is made, the Examiner provided an example of 
two sequences ABCDEF and ABEF. Using the definition provided in the specification, these 
sequences would align as follows: 

     ABCDEF
     AB--EF

where one gap in the length of the sequence was introduced to assist in that alignment. As a
result, the percentage of homology or percent identity is 4/4 or 100%, i.e., the optimal
number of amino acid residues that identically align divided by the possible number of
residues to be aligned. If the sequences were of equal length, using for example ABXYEF, the
sequences are aligned without gaps as:

     ABXYEF
     ||  ||
     ABCDEF

and the percentage of homology is 4/6 or 67%.
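
The arithmetic of the two examples above may be reproduced with a short calculation. The Python sketch below is offered for illustration only and is not taken from the specification. It assumes that the alignment has already been chosen by hand, that "-" marks a gap position, and that a run of adjacent gap characters counts as a single gap; the function names `percent_identity` and `count_gaps` are hypothetical.

```python
# Illustrative sketch only. It scores an alignment that has already been
# made; it does not search for the optimal alignment.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical aligned residues as a percentage of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count positions where both sequences carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # The denominator is the residue count of the smaller sequence.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


def count_gaps(aligned, gap="-"):
    """Count gaps, treating adjacent gap characters as one gap (assumption)."""
    gaps = 0
    previous_was_gap = False
    for ch in aligned:
        if ch == gap and not previous_was_gap:
            gaps += 1
        previous_was_gap = ch == gap
    return gaps
```

With these assumptions, `percent_identity("ABCDEF", "AB--EF")` gives 100.0 with `count_gaps("AB--EF")` equal to 1, and `percent_identity("ABXYEF", "ABCDEF")` gives approximately 66.7, which rounds to the 67% stated above.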

Because Applicants have provided the definition to be used, there is no need to seek
some other "definition" available from teachings in the art to understand and clearly
delimit the claims. The description supporting the "homology" provisions of the pending
claims has been provided in the specification, and the claims have been amended to clarify
the metes and bounds of the claimed subject matter. Thus, this rejection may properly be
withdrawn.

Claims 26 and 31 were objected to for improper "Markush" language. The claims 
have been amended and this objection is no longer applicable. 

For the foregoing reasons and in view of the amendments, Applicants respectfully 
request reconsideration of and withdrawal of the outstanding rejections. Applicants' 
representative would appreciate the opportunity to talk with the Examiner, in person, to 
facilitate the prosecution of the application. 

Respectfully submitted, 

Daniel R. Curry 
Attorney for Applicants
Registration No: 32,727
Phone: (805) 447-8102
Date: December 26, 1996 

Please send all future correspondence to: 

U.S. Patent Operations/DRC 

M/S 10-1-B 

AMGEN INC. 

Amgen Center 

1840 Dehavilland Drive 

Thousand Oaks, California 91320-1789 



[Exhibit 1: scanned laboratory notebook page showing an alignment of GDNF amino acid sequences labeled Human, Rat, Chicken and Xenopus. The sequence text is not legible in this copy.]

Sequence Analysis
Primer

Edited by
Michael Gribskov
and
John Devereux

Stockton Press
New York  London  Tokyo  Melbourne  Hong Kong



134 SIMILARITY AND HOMOLOGY 



Extensions

The primary drawback to dynamic programming methods is that they require
a considerable amount of computation. This limits their usefulness for tasks
such as database searching. One simple way to speed up the alignment is to
calculate only part of the score matrix, usually a diagonal band down the
center (e.g., Sankoff and Kruskal, 1983). This can be safely done, for
instance, if you know the sequences are homologous and do not require large
gaps in their alignment, or if you have information from a faster method, such 
as hashing (see Hashing and Neighborhood Algorithms) that tells you where 
the most similar regions of the sequences are. Several methods that perform 
a banded alignment and iteratively increase the width of the band until the 
optimal alignment is found have been presented (Ukkonen, 1983). 

A further great increase in alignment speed can be achieved through 
subdivision. If segments in each sequence can be identified, for instance by 
hashing methods (see Hashing and Neighborhood Algorithms), that are so 
similar they are unlikely to match with anything else, the alignment can be 
broken down into two smaller alignments, separated by the matching 
segment. Each equal subdivision increases the speed of the alignment by a 
factor of two. 

Once a cDNA clone is sequenced, one usually wishes to identify the
protein encoded by the message. One approach is to translate all three (or six)
reading frames of the nucleic acid sequence and use the resulting protein
sequences as probes in a fast database search (e.g., TFASTA - Lipman and
Pearson, 1985; TBLASTN - Gish et al., in preparation). Unfortunately, this
approach can be quite sensitive to frameshift errors in the cDNA sequence.
An alternative to this approach (States and Botstein, 1990) uses dynamic
programming methods to align the DNA and protein sequence.



SCORING SYSTEMS 

The simplest scoring systems for molecular sequence analysis give positive
scores only to comparisons of identical bases or residues. These scoring
tables are referred to as identity or unitary matrices and are still the primary
scoring systems used for nucleic acid sequences.
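
As an illustration of the identity (unitary) scoring described above, the following Python sketch scores an ungapped comparison of two equal-length sequences. It is not taken from the Primer; the function names and the choice of 1 for a match and 0 for a mismatch are assumptions made for this example.

```python
# Illustrative sketch only: a unitary scoring table gives a positive score
# (here 1, an assumed value) to identical symbols and 0 to everything else.
def unitary_score(a, b):
    """Score one aligned pair of bases or residues."""
    return 1 if a == b else 0


def score_ungapped(seq_a, seq_b):
    """Sum the unitary scores over an ungapped, equal-length comparison."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(unitary_score(x, y) for x, y in zip(seq_a, seq_b))
```

For example, `score_ungapped("GATTACA", "GACTATA")` returns 5, since five of the seven positions carry identical bases.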

The average rate of transition (purine to purine or pyrimidine to
pyrimidine) mutations is about three times the average rate of transversion
(purine to pyrimidine and vice versa) mutations. The rates of insertion/
deletion mutations can also be determined from known homologous sequences.
These values have been used to calculate scoring tables for nucleic
acid sequences based on maximum likelihood methods (Bishop and Thompson,
1986). However, mutation rates and characteristics vary dramatically
from species to species, from coding to non-coding regions, and from gene
to gene, making it impossible to define a single best scoring system by


SIMILARITY AND HOMOLOGY 



although it has been argued that it may be better to use an equivalent log-odds
matrix calculated for a lower PAM for alignments of unknown sequences
(Altschul, personal communication). The log of the probability of two
sequences being evolutionarily related can, in principle, be calculated as the
sum of the scores for each aligned pair of residues, i.e., the alignment score
if a log-odds matrix is used as the scoring system for the alignment. However,
this is overly simplistic since it ignores the effect of insertions and deletions 
on the probability. 

The accepted point mutation model of protein sequence evolution is 
known to be imperfect in a number of ways. One common criticism is that 
the frequency of mutations that require more than one base change in the 
DNA sequence is higher than would be expected for a simple Markovian 
model of DNA sequence evolution (Dayhoff and Eck, 1968; Wilbur, 1986). 
This criticism, however, is based on a specific model of DNA sequence 
evolution which is, itself, open to criticism (George et al., 1990). More 
importantly, the accepted point mutation model assumes that all residues in 
a protein are equally mutable: an assumption that is clearly incorrect. This 
can be easily seen by examining alignments of families of homologous 
proteins. For a set of six proteins, each sharing a pairwise sequence identity 
of 35% or less, one would expect to find not a single amino acid conserved 
in every sequence if all positions were equally mutable. In actual families of 
proteins, however, it is not unusual to find several residues that are absolutely 
conserved in dozens of distantly related sequences (the active site triad of the 
serine proteases, for example). Furthermore, by starting from aligned 
sequences with only one or two differences, Dayhoff and coworkers selected 
mutations occurring at the most mutable positions as the basis for their 
calculation. Since the most important features in alignments are those 
positions that are unusually conserved, it has been argued that scoring 
systems based on the chemical or structural properties of the amino acid 
residues may produce better alignments (for example, Kubota et al., 1981; 
Feng and Doolittle, 1985; Argos, 1987; Risler et al., 1988). Lastly, the matrix 
may be biased because it was derived using mainly small globular proteins 
as a basis. In spite of its flaws, the MDM78 table, or a scoring table derived 
from it, remains the only widely accepted means of scoring protein sequence 
alignments. The important position that the MDM78 table occupies in mo- 
lecular sequence analysis is clearly indicated by the fact that every new 
scoring system compares itself to the MDM78 table as a standard. 

A special class of scoring systems known as metric distances merits 
additional consideration. A metric is a distance measure that has the 
following properties: 

• no distance is less than 0 

• identical sequence characters have a distance of 0 

• the distance is symmetric; that is, the distance from A to B is 
the same as the distance from B to A 
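The properties listed above can be checked mechanically for any candidate distance function. The sketch below is an illustration only: the function names and the simple 0/1 distance are assumptions, and the check covers only the three properties given in the list.

```python
def check_listed_metric_properties(dist, symbols):
    """Test a distance function against the three listed properties."""
    pairs = [(a, b) for a in symbols for b in symbols]
    return {
        "non_negative": all(dist(a, b) >= 0 for a, b in pairs),
        "identity_is_zero": all(dist(a, a) == 0 for a in symbols),
        "symmetric": all(dist(a, b) == dist(b, a) for a, b in pairs),
    }


def unit_distance(a, b):
    """Assumed example: 0 for identical characters, 1 otherwise."""
    return 0 if a == b else 1


if __name__ == "__main__":
    print(check_listed_metric_properties(unit_distance, "ACGT"))
```

A log-odds table generally fails the first two checks, since it contains negative values and gives identical residues a positive score rather than zero.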



LOG-ODDS MATRICES 

[Table: 23 x 23 matrix of log-odds scores; rows and columns are the residue 
codes A B C D E F G H I K L M N P Q R S T V W X Y Z.] 


This table is the log-odds form of the mutational distance matrix at 250 PAM (percent accepted 
mutation) as calculated by Dayhoff and co-workers (Dayhoff, 1978). This scoring table is 
probably the most commonly used in protein sequence comparisons and is also known as the 
MDM78 table. When comparing two sequences, the value in the row corresponding to a residue 
in the first sequence and the column corresponding to a residue in the second sequence 
indicates how likely these residues are to have arisen from related sequences. Specifically, 
each value is the log of the ratio of the probability that the residues resulted from mutation 
of a common ancestor to the probability that they are aligned by chance. Positive values therefore 
indicate residues that are more likely than chance to have a common ancestor, and negative 
values indicate that an evolutionary relationship is less likely than chance. 
The PAM matrices are derived using a model of evolution wherein all positions are equally 
mutable, and are based on a specific set of observations of mutational frequency. For more 
details on the calculation of PAM matrices and their limitations see chapter 3 (Scoring 
Systems). 

Values for B and Z are the averages of values for D and N, and E and Q, respectively. X is 
the average value for all comparisons. 
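The definition of a log-odds value and the averaging rule for the ambiguity codes can be illustrated with a short Python sketch. The base-10 logarithm, the scale factor of 10, the probabilities, and the two scores in the usage example are all assumptions chosen for illustration; the text states only that the value is the log of a probability ratio and that B and Z are averages.

```python
import math


def log_odds(p_related, p_chance, scale=10.0):
    """Log of the ratio of two probabilities.
    The base-10 log and the scale factor are assumed conventions."""
    return scale * math.log10(p_related / p_chance)


def averaged_score(scores, residue, members):
    """Score of `residue` against an ambiguity code, taken as the mean
    of its scores against the member residues (e.g. B = D or N)."""
    return sum(scores[(residue, m)] for m in members) / len(members)


if __name__ == "__main__":
    print(log_odds(0.02, 0.01) > 0)   # more likely than chance
    print(log_odds(0.005, 0.01) < 0)  # less likely than chance
    toy = {("K", "D"): 0, ("K", "N"): 1}  # placeholder scores
    print(averaged_score(toy, "K", "DN"))
```

The sign of the result carries the interpretation given in the caption: positive when common ancestry is more probable than chance, negative when it is less probable.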



BIOCHEMISTRY 



THIRD EDITION 




LUBERT STRYER 

STANFORD UNIVERSITY 




W. H. FREEMAN AND COMPANY / NEW YORK 







Figure 2-2 

Electron micrograph of a cross sec- 
tion of insect flight muscle showing a 
hexagonal array of two kinds of pro- 
tein filaments. [Courtesy of Dr. Mi- 
chael Reedy.] 







Figure 2-3 

Electron micrograph of a fiber of 
collagen. [Courtesy of Dr. Jerome 
Gross and Dr. Romaine Bruns.] 



3. Coordinated motion. Proteins are the major component of muscle. 
Muscle contraction is accomplished by the sliding motion of two kinds 
of protein filaments. On the microscopic scale, such coordinated mo- 
tions as the movement of chromosomes in mitosis and the propulsion of 
sperm by their flagella also are produced by contractile assemblies con- 
sisting of proteins. 

4. Mechanical support. The high tensile strength of skin and bone is 
due to the presence of collagen, a fibrous protein. 

5. Immune protection. Antibodies are highly specific proteins that rec- 
ognize and combine with such foreign substances as viruses, bacteria, 
and cells from other organisms. Proteins thus play a vital role in distin- 
guishing between self and nonself. 

6. Generation and transmission of nerve impulses. The response of nerve 
cells to specific stimuli is mediated by receptor proteins. For example, 
rhodopsin is the photoreceptor protein in retinal rod cells. Receptor 
proteins that can be triggered by specific small molecules, such as ace- 
tylcholine, are responsible for transmitting nerve impulses at synapses — 
that is, at junctions between nerve cells. 

7. Control of growth and differentiation. Controlled sequential expres- 
sion of genetic information is essential for the orderly growth and dif- 
ferentiation of cells. Only a small fraction of the genome of a cell is 
expressed at any one time. In bacteria, repressor proteins are impor- 
tant control elements that silence specific segments of the DNA of a cell. 
In higher organisms, growth and differentiation are controlled by 
growth factor proteins. For example, nerve growth factor guides the 
formation of neural networks. The activities of different cells in multi- 
cellular organisms are coordinated by hormones. Many of them, such 
as insulin and thyroid-stimulating hormone, are proteins. Indeed, pro- 
teins serve in all cells as sensors that control the flow of energy and 
matter. 





Nerve growth 
factor 




Figure 2-4 

Photomicrograph of a ganglion showing the proliferation of nerves after addi- 
tion of nerve growth factor, a complex of proteins. [Courtesy of Dr. Eric 
Shooter.] 



PROTEINS ARE BUILT FROM A REPERTOIRE 
OF TWENTY AMINO ACIDS 



Amino acids are the basic structural units of proteins. An α-amino acid 
consists of an amino group, a carboxyl group, a hydrogen atom, and a 



18 

Part I 

MOLECULAR DESIGN OF LIFE 



Figure 2-8 

Amino acids having aliphatic side chains: glycine (Gly, G), alanine (Ala, A), 
valine (Val, V), leucine (Leu, L), and isoleucine (Ile, I). 



Let us look at this repertoire of amino acids. The simplest one is 
glycine, which has just a hydrogen atom as its side chain (Figure 2-8). 
Alanine comes next, with a methyl group as its side chain. Larger hydro- 
carbon side chains (three and four carbons long) are found in valine, 
leucine, and isoleucine. These larger aliphatic side chains are hydrophobic — 
that is, they have an aversion to water and like to cluster. As will be 
discussed later, the three-dimensional structure of water-soluble pro- 
teins is stabilized by the coming together of hydrophobic side chains to 
avoid contact with water. The different sizes and shapes of these hydro- 
carbon side chains (Figure 2-9) enable them to pack together to form 
compact structures with few holes. 



Figure 2-9 

Models of aliphatic 
amino acids 




[Structure: proline (Pro, P)] 



Figure 2-10 

Proline differs from the other com- 
mon amino acids in having a second- 
ary amino group. 



Proline also has an aliphatic side chain but it differs from other mem- 
bers of the set of twenty in that its side chain is bonded to both the 
nitrogen and α-carbon atoms. The resulting cyclic structure (Figure 
2-10) markedly influences protein architecture. Proline, often found in 
the bends of folded protein chains, is not averse to being exposed to 
water. Note that proline contains a secondary rather than a primary 
amino group, which makes it an imino acid. 

Three amino acids with aromatic side chains are part of the fundamen- 
tal repertoire (Figure 2-11). Phenylalanine, as its name indicates, con- 
tains a phenyl ring attached to a methylene (-CH2-) group. Trypto- 
phan has an indole ring joined to a methylene group; this side chain 
contains a nitrogen atom in addition to carbon and hydrogen atoms. 



Figure 2-11 

Phenylalanine (Phe, F), tyrosine (Tyr, Y), and tryptophan (Trp, W) have 
aromatic side chains. 






Figure 2-17 

Model of arginine. The planar outer 
part of the side chain, consisting of 
three nitrogens bonded to a carbon 
atom, is called a guanidinium group. 



Figure 2-16 

Lysine (Lys, K), arginine (Arg, R), and histidine (His, H) have basic side chains. 



sites of enzymes, where its imidazole ring can readily switch between 
these states to catalyze the making and breaking of bonds. These basic 
amino acids are depicted in Figure 2-16. The side chains of arginine and 
lysine are the longest ones in the set of twenty. 

The repertoire of amino acids also contains two with acidic side chains, 
aspartic acid and glutamic acid. These amino acids are usually called as- 
partate and glutamate to emphasize that their side chains are nearly al- 
ways negatively charged at physiological pH (Figure 2-18). Uncharged 
derivatives of glutamate and aspartate are glutamine and asparagine, 
which contain a terminal amide group in place of a carboxylate. 




Figure 2-19 

Model of glutamate. 



Figure 2-18 

Acidic amino acids, aspartate and glutamate (Glu, E), and their amide 
derivatives, asparagine (Asn, N) and glutamine (Gln, Q). 



Seven of the twenty amino acids have readily ionizable side chains. 
Equilibria and typical pKa values for ionization of the side chains of 
arginine, lysine, histidine, aspartic and glutamic acids, cysteine, and ty- 
rosine in proteins are given in Table 2-1. Two other groups in proteins, 
the terminal α-amino group and the terminal α-carboxyl group, can be 
ionized. 

Amino acids are often designated by either a three-letter abbrevia- 
tion or a one-letter symbol to facilitate concise communication (Table 
2-2). The abbreviations for amino acids are the first three letters of 
their names, except for tryptophan (Trp), asparagine (Asn), glutamine 
(Gln), and isoleucine (Ile). The symbols for the small amino acids are 
the first letters of their names (e.g., G for glycine and L for leucine); the 
other symbols have been agreed upon by convention. These abbrevia- 
tions and symbols are an integral part of the vocabulary of biochemists. 
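The two notations can be related with a simple lookup table. The following Python sketch lists the standard three-letter abbreviations and one-letter symbols for the twenty amino acids; the function name and the hyphen-separated input format are assumptions made for illustration.

```python
# Standard three-letter abbreviations and one-letter symbols.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide, sep="-"):
    """Convert e.g. 'Gly-Ala-Trp' to 'GAW' (input format is assumed)."""
    return "".join(THREE_TO_ONE[r.strip().capitalize()]
                   for r in peptide.split(sep))


if __name__ == "__main__":
    print(to_one_letter("Gly-Ala-Trp-Ile"))
```

The table makes the exceptions visible: Trp, Asn, Gln, and Ile are not the first three letters of the names, and symbols such as W, N, and Q are conventional rather than initial letters.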




Oligonucleotide-mediated Mutagenesis 



Oligonucleotide-mediated mutagenesis is used to add, delete, or substitute 
nucleotides in a segment of DNA whose sequence is known. In contrast to 
most other methods of mutagenesis, which typically spawn mixed populations 
of variants, oligonucleotide-mediated mutagenesis specifically generates mu- 
tations designed by the experimenter. Because of this precision, the method 
can be used, for example, to alter individual codons in protein coding 
sequences or to generate defined changes in sequences that have a regulatory 
function. In addition, oligonucleotide-mediated mutagenesis can be used to 
facilitate construction of new vectors and chimeric genes. For example, 
sequences such as ribosome-binding sites or polyadenylation signals can be 
inserted at predetermined positions in expression vectors; inconvenient re- 
striction sites can be removed and convenient sites can be added at specified 
positions; and, finally, undesirable sequences (e.g., introns and DNA se- 
quences that code for untranslated regions of mRNA) can be eliminated and 
different domains and coding regions can be linked together with precision. 

The feasibility of introducing specific changes at defined locations in DNA 
was first recognized in the early 1970s from work aimed at mapping the 
locations of mutations on the single-stranded genome of the small bac- 
teriophage φX174. When fragments of denatured wild-type bacteriophage 
DNA were transfected into susceptible bacteria together with intact single- 
stranded bacteriophage DNA carrying an amber mutation, "marker rescue" 
was observed, i.e., bacteriophages carrying wild-type genomes were gener- 
ated. Marker rescue occurred in the transfected bacteria because the 
fragment of wild-type DNA annealed to the corresponding sequence of the 
amber mutant, forming a mismatched heteroduplex that was converted by 
host-specified mismatch-repair systems into a full-length, wild-type genome. 
It was quickly realized that this process could also be used in reverse, i.e., 
that specific mutations could be introduced into wild-type DNA using mu- 
tated double-stranded fragments of viral DNA (Weisbeek and van de Pol 
1970; Hutchison and Edgell 1971). Later, when pioneering work in DNA 
chemistry had led to the routine synthesis of oligonucleotides (Letsinger and 
Lunsford 1976; Khorana 1979; Matteucci and Caruthers 1981), and when the 
availability and quality of DNA-modifying enzymes (DNA polymerase and 
DNA ligase) had improved, in vitro techniques for oligonucleotide-mediated 
DNA mutagenesis were developed. The first methods used synthetic oligonu- 
cleotides that were completely homologous to single-stranded bacteriophage 
φX174 DNA except for a single base change that, if incorporated into the 
bacteriophage genome, would generate a selectable phenotype. The oligonu- 
cleotides were annealed to single-stranded bacteriophage φX174 DNA and 
used as primers for DNA synthesis catalyzed in vitro by the Klenow fragment 
of E. coli DNA polymerase I. When the resulting heteroduplexes were 
transfected into bacteria, a dramatic increase was observed in the frequency 
of bacteriophages displaying the desired phenotype (Hutchison et al. 1978; 
Razin et al. 1978). 



Site-directed Mutagenesis of Cloned DNA 



Preparation of Single-stranded Target DNA 

All oligonucleotide-mediated mutagenesis procedures require a target DNA 
that is at least partially single-stranded. This can be prepared simply and 
efficiently by cloning the target DNA into bacteriophage M13 or into recombi- 
nant plasmids (phagemids) containing origins of replication derived from 
single-stranded bacteriophages (see Chapter 4). There are good reasons why 
the segment of target DNA cloned into these vectors should be as small as 
possible: 

• Large segments of DNA can be unstable in single-stranded bacteriophage 
vectors and are prone to suffer spontaneous deletion. 

• The chance that the mutagenic oligonucleotide will hybridize to an inap- 
propriate site rather than to a desired sequence increases as the size of the 
target DNA increases. 

• To ensure that the mutagenized target DNA contains only the desired 
mutation and no other, it is essential to sequence the entire fragment after 
oligonucleotide-mediated mutagenesis has been completed. The shorter the 
target DNA, the easier the task of determining its entire sequence. 

In most instances, naturally occurring restriction sites can be used to insert 
an appropriately sized segment of target DNA (<1 kb) into a single-stranded 
DNA vector. In a few cases, however, when no suitable restriction sites are 
present, it may be necessary to consider other options. These include: 

• Cloning a fragment of target DNA that is larger than optimal. 

• Carrying out preliminary oligonucleotide-mediated mutagenesis to intro- 
duce restriction site(s) at appropriate locations in the target DNA. This is 
worthwhile when many oligonucleotide-mediated mutations are to be intro- 
duced into the same region of DNA, for example, when generating a 
comprehensive set of substitutions of a particular amino acid. Under such 
circumstances, it is usually possible to take advantage of the degeneracy of 
the genetic code to introduce novel restriction site(s) upstream of or down- 
stream from the target area without altering the sequence of amino acids 
for which the DNA codes. The resulting "cassette" of DNA can then be 
easily shuttled in and out of the bacteriophage M13 vector used for 
oligonucleotide-mediated mutagenesis. 

• Avoiding bacteriophage M13 vectors altogether. Mutagenesis is then car- 
ried out using double-stranded DNA derived from a plasmid. A number of 
methods have been developed to avoid the use of single-stranded vectors, 
but all of them are comparatively inefficient and should only be used in 
desperation. These methods include: (1) exonucleolytic digestion of plasmid 
DNA that has been nicked at a specific site (Wallace et al. 1981a; Efimov et 
al. 1985), (2) denaturation of supercoiled DNA (Schold et al. 1984), and 
(3) formation of heteroduplexes between two linear DNA fragments such 
that the resulting molecule is circular and has a single-strand gap that in- 
cludes the target site for mutagenesis (Oostra et al. 1983; Morinaga et al. 
1984). 
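The use of the degeneracy of the genetic code to introduce a restriction site without altering the encoded amino acids, as described in the second option above, can be sketched in Python. The function names, the restriction to single-base changes, and the assumption that the sequence is given in frame are simplifications for illustration; the example site GGATCC is the BamHI recognition sequence.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame DNA sequence, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def silent_site_mutations(coding_seq, site):
    """List single-base changes (position, old, new) that create `site`
    while leaving the encoded amino acid sequence unchanged."""
    coding_seq = coding_seq.upper()
    if site in coding_seq:
        return []  # the site is already present
    protein = translate(coding_seq)
    hits = []
    for pos, old in enumerate(coding_seq):
        for new in BASES:
            if new == old:
                continue
            mutant = coding_seq[:pos] + new + coding_seq[pos + 1:]
            if site in mutant and translate(mutant) == protein:
                hits.append((pos, old, new))
    return hits


if __name__ == "__main__":
    # GGT TCC encodes Gly-Ser; GGA also encodes Gly.
    print(silent_site_mutations("GGTTCC", "GGATCC"))
```

In practice a site often requires more than one base change, so a fuller search would consider combinations of synonymous codons rather than single substitutions.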






1985) or Sequenase™ (Schena 1989), which cannot readily remove the 
hybridized mutagenic primer from its templates. 

• The hybrid formed between the template and the 3' terminus of the 
oligonucleotide is sufficiently stable to allow priming of DNA synthesis 
to occur with high efficiency. If the mismatched nucleotide is too close to 
the 3' terminus, the 3' segment of the oligonucleotide will be unable to 
form a stable hybrid with the target DNA and will therefore be suscep- 
tible to exonucleolytic degradation if the Klenow fragment of E. coli DNA 
polymerase I is used in the primer-extension reaction (Gillani and Smith 
1979). In addition, an increase in the frequency of priming at incorrect 
locations might occur because the unhybridized 3' region of the 
mutagenic oligonucleotide is now free to anneal to incorrect sites on the 
template. To suppress these effects, between 7 and 9 perfectly matched 
nucleotides are required at the 3' terminus of the mutagenic oligonu- 
cleotide. 

• The difference in thermal stability between perfectly matched hybrids 
and mismatched hybrids is sufficiently great that the mutagenic oligonu- 
cleotide can be used to screen bacteriophage M13 plaques by hybridiza- 
tion for potential mutants. As discussed in Chapter 11, the longer the 
oligonucleotide, the smaller the difference in thermal stability between a 
perfectly matched hybrid and one containing a single mismatched base 
pair. The aim, therefore, is to use the shortest mutagenic oligonu- 
cleotide that, under the conditions used for primer extension, will form 
stable hybrids both upstream of and downstream from the mismatch. 
Under normal circumstances, the mutagenic oligonucleotide should be 
between 17 and 19 nucleotides in length, with the mismatch centrally 
located. 

2. Oligonucleotides used to create deletions or insertions or to substitute two 
or more contiguous nucleotides. Oligonucleotides 25 or more nucleotides in 
length are used to insert, delete, or substitute two or more bases. Opti- 
mally, there should be 12-15 perfectly matched nucleotides on either side 
of the central looped-out region to ensure that both ends of the mutagenic 
oligonucleotide are stably hybridized at the temperature used for primer 
extension. The thermal stability of each of the two flanking regions can be 
estimated from the following formula: 

Tm = 4(G + C) + 2(A + T) 

where Tm = melting temperature (in 6x SSC), G = number of G residues 
in the sequence, C = number of C residues in the sequence, A = number of 
A residues in the sequence, and T = number of T residues in the sequence. 
Mutagenesis occurs efficiently when the Tm of each of the double-stranded 
regions flanking the mismatched or looped-out sequence is approximately 
35-40°C. 
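The formula above is easily applied to candidate flanking sequences. The Python sketch below computes the estimate and checks whether both flanks fall in the 35-40°C range given in the text; the function names and the example sequences are hypothetical.

```python
def estimate_tm(seq):
    """Tm = 4(G + C) + 2(A + T), in degrees C, for a short oligonucleotide."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence may contain only A, C, G, and T")
    return 4 * gc + 2 * at


def flanks_in_range(left_flank, right_flank, low=35, high=40):
    """True if the estimated Tm of each flanking region is within range."""
    return all(low <= estimate_tm(f) <= high
               for f in (left_flank, right_flank))


if __name__ == "__main__":
    left, right = "GCATGCATGCAT", "GGATCCATTACG"  # hypothetical flanks
    print(estimate_tm(left), estimate_tm(right))
    print(flanks_in_range(left, right))
```

A flank of 12 nucleotides with equal numbers of G + C and A + T residues gives an estimate of 36°C, which is consistent with the recommended 12-15 matched nucleotides on each side.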

In general, the larger the size of the mutation to be constructed, the 
lower the efficiency of oligonucleotide-mediated mutagenesis. This inef- 
ficiency stems from two sources. First, the ability of the mutagenic 
oligonucleotide to form stable hybrids with two separate sequences on the 






Hybridization of Oligonucleotides to the Template DNA 
and Primer Extension 

In the standard double-primer method (Norris et al. 1983; Zoller and Smith 
1984, 1987) (see Figure 15.7), two oligonucleotides — the phosphorylated 
mutagenic oligonucleotide and a universal sequencing primer (which need not 
be phosphorylated) — are mixed in a 10- to 50-fold molar excess with the 
single-stranded template DNA in a small volume of buffer. The template and 
oligonucleotides are heated briefly to 20°C above the estimated Tm, to 
denature any regions of secondary structure, and then cooled slowly to room 
temperature. Hybrids form as the temperature of the reaction mixture drops 
below the relevant Tm. The stoichiometry of the reagents in the mixture 
ensures that virtually all of the single-stranded template is driven into 
hybrids with both oligonucleotides. 

This protocol works well for mutagenic oligonucleotides of a wide variety of 
lengths and base compositions. However, annealing mixtures containing 
mutagenic oligonucleotides that are exceptionally rich in A + T may need to 
be cooled to lower temperatures (12-16°C) in order to stabilize the hybrids. 
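The quantities in the annealing step follow directly from the figures given above: a 10- to 50-fold molar excess of each oligonucleotide, heating to 20°C above the estimated Tm, and a lower final temperature for A + T-rich oligonucleotides. The following Python sketch simply collects that arithmetic; the function name, argument names, and the example amounts are hypothetical.

```python
def annealing_plan(template_pmol, estimated_tm_c, at_rich=False):
    """Collect the annealing parameters described in the text.
    template_pmol: amount of single-stranded template, in pmoles.
    estimated_tm_c: estimated Tm of the oligonucleotide, in degrees C."""
    return {
        # 10- to 50-fold molar excess of oligonucleotide over template
        "oligo_pmol_range": (10 * template_pmol, 50 * template_pmol),
        # heat briefly to 20 degrees C above the estimated Tm
        "denature_temp_c": estimated_tm_c + 20,
        # A + T-rich oligonucleotides may need cooling to 12-16 degrees C
        "final_temp": "12-16 C" if at_rich else "room temperature",
    }


if __name__ == "__main__":
    print(annealing_plan(template_pmol=0.5, estimated_tm_c=36))
```

The amount of template used in the example is an assumption; the protocol itself specifies only the molar ratio.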

After the annealing reaction is complete, a mixture containing a DNA 
polymerase, dNTPs, DNA ligase, and ATP is added, and primer extension is 
allowed to proceed for 2-15 hours at the appropriate temperature for the 
polymerase being used. DNA synthesis is initiated both at the 3' terminus of 
the mutagenic oligonucleotide and at the 3' terminus of the upstream 
universal sequencing primer. The sequencing primer is extended by DNA 
polymerase until the 3' terminus of the newly synthesized strand encounters 
the 5' terminus of the phosphorylated mutagenic oligonucleotide (see Figure 
15.7). The two segments of DNA then become ligated by the action of 
bacteriophage T4 DNA ligase. This sealing of the phosphodiester bond 
protects the 5' terminus of the mutagenic oligonucleotide from displacement 
by the newly growing strand and prevents 5' exonucleolytic editing of 
mismatched nucleotides after the DNA is transfected into E. coli. The 3' 
terminus of the mutagenic oligonucleotide is also extended, resulting in the 
formation of a mutant wild-type heteroduplex (see Figure 15.7) that is 
transfected into the E. coli. 

Any of several different DNA polymerases may be used in the extension 
reaction. Until recently, most workers relied exclusively on the Klenow 
fragment of E. coli DNA polymerase I, which lacks 5' exonucleolytic activity 
and is therefore incapable of degrading the mutagenic oligonucleotide. Al- 
though the efficiency of mutagenesis is reported to be higher when bac- 
teriophage T4 DNA polymerase (Geisselsoder et al. 1987) or Sequenase 
(Schena 1989) is used, the Klenow fragment generally gives yields of mutants 
that are more than adequate, and this remains the enzyme of choice. 






Transfection of E. coli and Screening for Mutants 

After the primer-extension reaction is complete, the resulting mixture of 
double-stranded heteroduplex DNA is then transfected directly into an appro- 
priate bacterial host. Most of the transfected cells release bacteriophage 
particles that carry a wild-type copy of the target fragment. However, a 
portion of the transfected cells generate particles whose genomes carry the 
desired mutation. Mutants can therefore be identified by screening plaques 
by hybridization, using the radiolabeled mutagenic oligonucleotide as a 
probe. 

Earlier protocols (see, e.g., Zoller and Smith 1982, 1983), in which the 
extension reaction was primed by a single, phosphorylated, mutagenic 
oligonucleotide, called for the purification of covalently closed circular DNA 
before transfection. The aim of this step was to reduce the background of 
plaques containing wild-type target DNA. However, this time-consuming 
enrichment procedure is no longer necessary because of the improvement in 
efficiency of mutagenesis brought about by the use of double primers (see 
page 15.57). In addition, advances in the understanding of the hybridization 
properties of small oligonucleotides (reviewed in Chapter 11) have made it 
possible to screen rapidly and simultaneously many thousands of plaques to 
identify those generated by bacteriophages carrying the desired mutation. 
Hybridization is usually carried out under conditions that allow the 
radiolabeled oligonucleotide to form hybrids with both mutant and wild-type 
DNA. By progressively increasing the stringency of the subsequent washes, 
it is almost always possible to find conditions that (1) cause dissociation of 
mismatched hybrids, such as those formed between the mutagenic oligonu- 
cleotide and wild-type DNA, and (2) do not dissociate perfect hybrids formed 
by the oligonucleotide and the desired mutant. 






Methods to Improve the Efficiency of Oligonucleotide-mediated 
Mutagenesis 

The basic procedures described above have been used successfully for several 
years to isolate a wide variety of site-directed mutants. Occasionally, 
however, difficulties are encountered in obtaining a particular mutation by 
the standard procedure. These difficulties have several causes: 

• The nature of the mutation itself. The larger and more complex the muta- 
tion, the lower the efficiency with which it will be generated. For example, 
large deletions are generated less efficiently than mutations involving only 
local changes in sequence. This decrease almost certainly occurs because 
the mutagenic oligonucleotide has problems in forming and maintaining 
stable hybrids with widely separated tracts of target DNA. 

• The nature of the target sequences. Regions of single-stranded target DNA 
with a high propensity to form stable secondary structures (hairpin loops, 
stem loops, etc.) are difficult to mutagenize, presumably because such 
structures reduce the efficiency of annealing of the mutagenic oligonu- 
cleotide. A related problem may arise when the target DNA consists of, or 
contains, repeated sequences. In these cases, the mutagenic oligonucleotide 
may be able to anneal to sequences present at more than one location in the 
target DNA and may generate additional mutations at these sites. Thus, 
only a fraction of the plaques that hybridize to the oligonucleotide probe 
may actually carry the desired mutation. 

• The nature of the vector. Oligonucleotide-mediated mutagenesis is general- 
ly carried out with templates generated from recombinant M13 bac- 
teriophages. However, single-stranded templates can also be obtained by 
superinfection of bacteria that have been transformed with plasmids 
(phagemids) carrying an origin of DNA replication derived from a single- 
stranded bacteriophage vector (see Chapter 4). Although the phagemid 
system bypasses two time-consuming steps — cloning the fragment of target 
DNA into a bacteriophage M13 vector and recovering it after muta- 
genesis — the overall efficiency of mutagenesis is reduced by approximately 
five- to tenfold. This inefficiency results largely from variation in yield of 
single-stranded DNA from one superinfection experiment to the next and 
from one phagemid clone to another. Although the phagemid system can 
generally be used to isolate simple mutants (containing point mutations 
and small deletions or insertions), bacteriophage M13 vectors are preferred 
when constructing more complicated mutants. 

To improve the efficiency with which such recalcitrant mutants can be 
isolated, a large number of variations of site-directed mutagenesis have been 
described (see, e.g., Marmenout et al. 1984; Bauer et al. 1985; Carter et al. 
1985; Smith 1985; Taylor et al. 1985; Carter 1987; Zoller and Smith 1987). 
The best of these methods, however, was developed by Kunkel (1985; Kunkel 
et al. 1987) (see page 15.74) and takes advantage of a strong biological 
selection that can be applied against the wild-type strand of DNA used as 
template in oligonucleotide-mediated, site-directed mutagenesis. Because of 



Site-directed Mutagenesis of Cloned DNA 15.61 



OLIGONUCLEOTIDE-MEDIATED MUTAGENESIS BY THE DOUBLE-PRIMER METHOD

The following method is modified from Zoller and Smith (1987). 

1. In preparation for mutagenesis, clone a small fragment of DNA carrying 
the target sequence into an appropriate bacteriophage M13 vector (usually 
bacteriophage M13mp18 or M13mp19). Prepare single-stranded template
DNA from a plaque generated by the recombinant bacteriophage. 
Methods for cloning into bacteriophage M13 vectors and for preparation of 
single-stranded bacteriophage DNA are given in Chapter 4. 

The mutagenic oligonucleotide should be designed as described on pages
15.54-15.56 and should be complementary to the strand of the target
DNA that is packaged in bacteriophage M13 particles [the (+) strand].
Before use in site-directed mutagenesis, the mutagenic oligonucleotide
should be purified by Sep-Pak C18 column chromatography to remove salts
and other impurities (see Chapter 11, page 11.39). However, it is
generally not necessary to purify the oligonucleotide by polyacrylamide gel
electrophoresis unless the oligonucleotide is more than 30 nucleotides in
length or is to be used for "loop-in" or "loop-out" mutagenesis.

2. Phosphorylate the mutagenic oligonucleotide with bacteriophage T4
polynucleotide kinase. Mix:

mutagenic oligonucleotide                             100-200 pmoles
H2O                                                   to 16.5 µl
10 x bacteriophage T4 polynucleotide kinase buffer    2 µl
10 mM ATP                                             1 µl
bacteriophage T4 polynucleotide kinase                4 units

Incubate the reaction for 1 hour at 37°C, and then heat the reaction for 
10 minutes at 68°C to inactivate the polynucleotide kinase. 

10 x Bacteriophage T4 polynucleotide kinase buffer

0.5 M Tris·Cl (pH 7.6)
0.1 M MgCl2
50 mM dithiothreitol
1 mM spermidine HCl
1 mM EDTA (pH 8.0)



3. Anneal the phosphorylated mutagenic oligonucleotide and nonphosphorylated
universal sequencing primer to the single-stranded bacteriophage
M13 DNA containing the target sequence. Mix:

single-stranded template DNA (~1 µg)              0.5 pmole
phosphorylated mutagenic oligonucleotide          10 pmoles
nonphosphorylated universal sequencing primer     10 pmoles
10 x PE1 buffer                                   1 µl
H2O                                               to 10 µl
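
The equivalence between about 1 µg of template and 0.5 pmole, and the molar excess of oligonucleotide over template, can be checked with a short calculation. The sketch below is illustrative only: the average nucleotide mass of 330 g/mol is a common rule of thumb rather than a value from this protocol, and the 7,500-nucleotide template length is a hypothetical recombinant.

```python
# Rule-of-thumb average mass of one nucleotide in single-stranded DNA (g/mol).
# Assumption for illustration; not a value taken from the protocol.
AVG_NT_MASS = 330.0


def ssdna_pmol(mass_ug, length_nt):
    """Convert a mass of single-stranded DNA (µg) into picomoles of molecules."""
    grams = mass_ug * 1e-6
    moles = grams / (length_nt * AVG_NT_MASS)
    return moles * 1e12


# Hypothetical M13 recombinant of about 7,500 nucleotides.
template_pmol = ssdna_pmol(1.0, 7500)

# Amounts listed in the annealing mixture (step 3).
oligo_pmol = 10.0
nominal_template_pmol = 0.5
molar_excess = oligo_pmol / nominal_template_pmol

print(f"1 µg of template is about {template_pmol:.2f} pmol")
print(f"oligonucleotide : template = {molar_excess:.0f} : 1")
```

With these assumptions, 1 µg of template works out to roughly 0.4 pmole, in line with the nominal 0.5 pmole, and each oligonucleotide is present in a 20-fold molar excess over the template.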






able to displace the mutagenic oligonucleotide from its template. Use dNTPs of the 
highest quality to minimize the possibility that contaminating dUTP might be 
incorporated into the newly synthesized strand of DNA. The concentrated dNTP 
solutions sold by Pharmacia have worked well in our hands. 

When using bacteriophage T4 DNA polymerase or Sequenase, the extension reaction
should be incubated for 5 minutes at 0°C, 5 minutes at room temperature, and
then 2 hours at 37°C. The low temperature optimizes initiation of DNA synthesis
from the 3' terminus, and the subsequent incubation at 37°C improves the efficiency of
the extension reaction. In addition, the concentration of each of the four dNTPs in the
reaction should be increased to 500 µM when using these polymerases. This increases
the efficiency of the extension reaction and suppresses the strong 3' exonuclease
activity of bacteriophage T4 DNA polymerase.

5. Add the ice-cold mixture to the reaction mixture containing
single-stranded DNA and annealed oligonucleotides (step 3). Incubate the 
final reaction mixture for 6-15 hours at 16°C. 

6. Transfect competent E. coli of an appropriate host strain (e.g., TG1) as
follows: 

a. Make a series of dilutions of the reaction mixture (1:10, 1:100, and 
1:500) in 10 mM Tris·Cl (pH 7.6).

b. In Falcon 2059 tubes, precooled to 0°C, combine 1- and 5-µl aliquots of
the undiluted reaction mixture and of each dilution of the reaction
mixture with 200-µl aliquots of competent TG1 cells (prepared as
described in Chapter 1, pages 1.76-1.81).

c. Store the mixtures on ice for 30 minutes, and then transfer them for 
exactly 2 minutes to a water bath equilibrated at 42°C. 

d. Remove the transfected cultures from the water bath, and mix each of
them with 100 µl of an overnight culture of TG1 cells.

There is no need to add TG1 cells if freshly prepared, rather than frozen,
competent cultures are used.

e. Add 2.5 ml of 2 x YT top agar (melted and cooled to 45°C) to each 
culture, and plate each mixture on a separate YT agar plate. Incubate 
the plates for 16 hours at 37°C to allow plaques to form. 

If mutagenesis is carried out by the Kunkel method (see pages 15.74-15.79), 1-µl and
5-µl aliquots of the undiluted reaction mixture are used to transfect competent
cultures of E. coli strain TG1.

If single-stranded DNA derived from a phagemid such as pUC118 or pUC119 is used
as the template for mutagenesis, 1-µl and 5-µl aliquots of the undiluted reaction
mixture are used to transform competent cultures of E. coli strain MV1184 (see
Chapter 4, page 4.15, for a description of this strain and Chapter 4, pages 4.37-4.38,
for a description of the transformation protocol). Plate 50-µl and 100-µl aliquots of
each transformation mixture onto LB agar plates containing 50 µg/ml ampicillin.
Ampicillin-resistant colonies should appear after 18-24 hours of incubation at 37°C.
Screen the transformed colonies as described on pages 15.72-15.73.

7. Screen plaques (pages 15.68-15.71) or colonies (pages 15.72-15.73) by 
hybridization with a radiolabeled oligonucleotide probe to detect putative 
mutants. 






c. Wrap the strip of polyethyleneimine-cellulose in Saran Wrap and 
autoradiograph, or cut the strip horizontally into thin (0.25-cm) sec- 
tions and measure the amount of radioactivity in each section in a 
scintillation counter. 

Oligonucleotides will remain at the origin, whereas ATP and inorganic phosphate will
migrate in the same direction as the solvent (inorganic phosphate migrates slightly
slower than the solvent front and ATP is approximately equidistant between the
origin and the inorganic phosphate). Thus, the transfer of phosphate from [γ-32P]ATP
to the oligonucleotide will result in the appearance of radioactivity at the origin. By
measuring the amount of radioactivity at the origin and on the total strip, the
percentage of radiolabel transferred from [γ-32P]ATP to the oligonucleotide can be
calculated. The specific activity of the probe can then be determined on the basis of
the molar quantities of oligonucleotide and [γ-32P]ATP in the reaction. Under the
conditions described above, approximately 50% of the radioactivity should be
transferred to the oligonucleotide, resulting in a specific activity of approximately 2500
Ci/mmole.

The transfer of radiolabel to the oligonucleotide can also be monitored by adsorption to 
DE-81 filters. The oligonucleotide binds tightly to the positively charged filters, 
whereas unincorporated radiolabel is removed by repeated washing with a solution of 
sodium phosphate. See Appendix E for details. 

4. (Optional) Remove unincorporated radiolabel from the oligonucleotide by 
one of the methods described in Chapter 11, pages 11.33-11.39. 

Generally, this step is necessary only when background hybridization is a persistent 
problem. Under normal circumstances, the unpurified reaction mixture may be used 
as a probe. 
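
The calculation of percentage transfer and probe specific activity described for the polyethyleneimine-cellulose strip can be sketched as follows. The counts, the 10 pmoles of oligonucleotide, and the 10 pmoles of [γ-32P]ATP at 5000 Ci/mmole are hypothetical inputs chosen for illustration, since the reaction conditions themselves are not reproduced here; the function names are likewise illustrative.

```python
def fraction_transferred(counts_at_origin, counts_on_strip):
    """Fraction of the radiolabel found at the origin (i.e., on the oligonucleotide)."""
    return counts_at_origin / counts_on_strip


def probe_specific_activity(fraction, atp_pmol, atp_ci_per_mmol, oligo_pmol):
    """Specific activity of the oligonucleotide in Ci/mmole.

    The label transferred is fraction * atp_pmol * atp_ci_per_mmol (the pmole
    units cancel when divided by oligo_pmol), spread over all oligonucleotide
    molecules present in the reaction.
    """
    return fraction * atp_pmol * atp_ci_per_mmol / oligo_pmol


# Hypothetical scintillation counts (cpm) for the origin and the whole strip.
fraction = fraction_transferred(52_000, 104_000)

# Hypothetical reaction: 10 pmoles oligonucleotide, 10 pmoles ATP at 5000 Ci/mmole.
activity = probe_specific_activity(fraction, atp_pmol=10,
                                   atp_ci_per_mmol=5000, oligo_pmol=10)

print(f"fraction transferred: {fraction:.0%}")
print(f"specific activity: {activity:.0f} Ci/mmole")
```

Under these assumed inputs, a 50% transfer corresponds to the specific activity of approximately 2500 Ci/mmole quoted above.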






Hybridization with the radiolabeled oligonucleotide should be performed at a
temperature 5-10°C below the Tm estimated from the following formula:

Tm = 4(G + C) + 2(A + T)
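
As a worked illustration of the formula, the following sketch counts the bases in an oligonucleotide, estimates Tm in °C, and reports the hybridization range 5-10°C below it. The 19-nucleotide probe sequence and the function names are hypothetical.

```python
def estimated_tm(oligo):
    """Estimate Tm (°C) as 4(G + C) + 2(A + T) for a short oligonucleotide."""
    seq = oligo.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def hybridization_range(tm):
    """Return (lowest, highest) hybridization temperature, 5-10°C below Tm."""
    return (tm - 10, tm - 5)


probe = "GATCCGTTAGCAAGCTTGC"  # hypothetical 19-mer: 10 G/C and 9 A/T
tm = estimated_tm(probe)
low, high = hybridization_range(tm)
print(f"Tm = {tm}°C; hybridize at {low}-{high}°C")
```

For this probe the estimate is 4(10) + 2(9) = 58°C, giving a hybridization temperature of 48-53°C.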

4. At the end of the hybridization period, cut off a corner of the plastic bag 
and pour the hybridization solution into a disposable plastic tube. Seal 
the tube, and store the solution at -20°C until it is needed for rescreening
positive plaques (steps 10 and 16).

5. Quickly transfer the filters to a tray containing 200-300 ml of 6 x SSC
at room temperature. Cover the tray with Saran Wrap, and place it on a 
rotating shaker for 15 minutes. Replace the washing fluid every 5 
minutes. 

6. Quickly transfer the filters to a piece of Saran Wrap stretched on the
bench. Do not allow the filters to dry. Cover the filters with another
piece of Saran Wrap. Fold the edges of the two pieces of Saran Wrap
together to form a tight seal. Apply adhesive dot labels marked with
radioactive ink to the outside of the package and establish an
autoradiograph by exposing the package of filters to X-ray film for 1-2
hours at -70°C, using an intensifying screen (see Appendix E).

Radioactive ink is made by mixing a small amount of 32P with waterproof black
drawing ink. We find it convenient to make the ink in three grades: very hot 
( > 2000 cps on a hand-held minimonitor), hot ( > 500 cps on a hand-held 
minimonitor), and cool ( > 50 cps on a hand-held minimonitor). Use a fiber-tip pen to 
apply ink of the desired hotness to the adhesive labels. Attach radioactive warning 
tape to the pen, and store it in an appropriate place. 

7. Compare the pattern of hybridization with the distribution of plaques. 
At this stage, it is normal to find that virtually every plaque hybridizes to 
the probe. Typically, however, some plaques hybridize more strongly 
than others, and these often turn out to be those that carry the desired 
mutation. 

8. Transfer the filters to a plastic box containing 100-200 ml of 6 x SSC 
that has been prewarmed to Tm - 10°C. Agitate the filters in the solution
for 2 minutes, and then transfer them to a piece of Saran Wrap as 
described in step 6. Establish another autoradiograph. At this stage, it 
is often possible to identify two types of plaques: those whose radioactive 
signal has decreased in intensity and those that show no change in 
intensity. 

To minimize dissociation of perfect hybrids formed between the radioactive oligonu- 
cleotide and the mutagenized target sequence, do not wash the filters for more than 
2 minutes. 

9. Repeat the cycles of washing and autoradiography, increasing the tem- 
perature of the 6 x SSC washing solution by 2-3°C in each cycle. The 
aim is to find a temperature that does not markedly affect perfect 
hybrids but causes dissociation of mismatched hybrids (such as those 






14. Isolate bacteriophage M13 replicative form DNA from a culture infected
with plaque-purified recombinant bacteriophages (step 11) that carry the
desired mutation and show no other changes in sequence in the target
region. Methods to isolate and purify bacteriophage M13 replicative form
DNA are given in Chapter 4.

15. Recover the mutated target sequence by digestion of bacteriophage M13
replicative form DNA with the appropriate restriction enzyme(s) and
preparative gel electrophoresis. Reclone the target DNA in the desired
vector.

16. Using a number of different restriction enzymes, digest aliquots of either
a recombinant that carries the original (unmutagenized) target sequence
or the recombinant (constructed in step 15) that carries the mutagenized
target sequence. Separate the resulting fragments by gel electrophoresis,
and transfer them to a solid support (e.g., nitrocellulose filter or nylon
membrane) as described in Chapter 9. Carry out Southern hybridization
at Tm - 10°C, using the 32P-labeled mutagenic oligonucleotide as probe.
Wash the filter under the discriminatory conditions established in step 9
and autoradiograph. The final autoradiograph should show hybridization
only to mutagenized target DNA.






step 6, page 15.65). Phagemid DNA is isolated from the pooled colonies and 
used to transform another batch of competent MV1184 cells. The resulting 
colonies, which will contain pure populations of either mutant or wild-type
phagemids, are then screened by hybridization, using the mutagenic oligonu- 
cleotide as probe. 







FIGURE 15.4

Oligonucleotide-mediated, site-directed mutagenesis using the Kunkel method (see
text for details). Steps shown in the figure: (1) generate single-stranded template
DNA by growing recombinant bacteriophage M13 in a dut- ung- F' strain of E. coli;
the template contains uracil residues instead of thymine (20-30 uracil
residues/template); (2) anneal the phosphorylated mutagenic oligonucleotide;
(3) extend and ligate with DNA polymerase, four dNTPs, and DNA ligase;
(4) transfect wild-type E. coli, in which the uracil-substituted strand of DNA is
degraded; (5) plaques are generated by the strand of DNA synthesized in vitro.






Chapter 4, there is a possibility that deleted variants will outgrow the original 
recombinant during extended periods of incubation. It is therefore advisable to 
verify that the majority of the single-stranded DNA used as template is of the 
correct size. Methods to analyze the size of bacteriophage M13 DNA by gel
electrophoresis are described in Chapter 4, pages 4.39-4.40. It is also essential to 
sequence the entire segment of foreign DNA after mutagenesis to ensure that no 
deletions or other types of mutations have occurred at sites other than the 
immediate target sequence. 

6. Measure the volume of the bacteriophage suspension, and then add 0.25 
volume of NaCl/PEG solution. Mix the contents of the centrifuge bottle 
by swirling, and then store the bottle on ice for 1 hour. 

NaCl/PEG solution 

15% w/v polyethylene glycol (PEG 8000) 
2.5 M NaCl

7. Recover the precipitated bacteriophage particles by centrifugation at 
5000g for 20 minutes at 4°C. Remove the supernatant by aspiration, and
then invert the bottle to allow the last traces of supernatant to drain 
away. Use a pipette attached to a vacuum line to remove any drops of 
solution adhering to the walls of the bottle. 

8. Resuspend the bacteriophage pellet in 4 ml of TE (pH 7.6). Transfer the 
suspension to a 15-ml Corex centrifuge tube, and wash the walls of the 
centrifuge bottle with another 2 ml of TE (pH 7.6). Transfer the washing 
to the Corex tube. Vortex the suspension vigorously for 30 seconds, and 
then store the tube on ice for 1 hour. 

9. Vortex the suspension vigorously for 30 seconds, and then centrifuge it 
once more at 5000g for 20 minutes at 4°C in a fixed-angle rotor (e.g., 
Sorvall SS34). 

10. Taking care not to disturb the pellet of bacterial debris, transfer the 
supernatant to a polypropylene tube. Extract the suspension twice with 
phenol equilibrated to pH 8.0 (see Appendix B) and once with phenol: 
chloroform. Separate the phases by centrifugation at 4000g for 5 minutes
at room temperature. Avoid transferring material from the interface. 

11. Transfer the aqueous phase from the final extraction to a glass centrifuge 
tube (e.g., a 30-ml Corex tube). Measure the volume of the solution, and 
add 0.1 volume of 3 M sodium acetate (pH 5.2), followed by 2 volumes of
ethanol at 0°C. Mix the contents of the tube thoroughly, and then store 
the tube on ice for 30 minutes. 

12. Recover the DNA by centrifugation at 5000g for 20 minutes at 4°C.
Carefully remove the supernatant. Add 10 ml of 70% ethanol at room 
temperature, vortex briefly, and recentrifuge. 
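
The relative volumes used in steps 6 and 11 can be turned into absolute volumes with a small helper. This is a sketch under stated assumptions: the 100-ml suspension and 6-ml aqueous phase are hypothetical, the function names are illustrative, and the ethanol volume is assumed to be calculated relative to the measured volume of the aqueous phase before sodium acetate is added.

```python
def peg_precipitation(suspension_ml):
    """Step 6: add 0.25 volume of NaCl/PEG solution (15% w/v PEG 8000, 2.5 M NaCl)."""
    peg_ml = 0.25 * suspension_ml
    total_ml = suspension_ml + peg_ml
    return {
        "nacl_peg_ml": peg_ml,
        "final_peg_percent": 15.0 * peg_ml / total_ml,
        "final_nacl_molar": 2.5 * peg_ml / total_ml,
    }


def ethanol_precipitation(aqueous_ml):
    """Step 11: add 0.1 volume of 3 M sodium acetate and 2 volumes of ethanol.

    Assumption: both additions are relative to the measured aqueous volume.
    """
    acetate_ml = 0.1 * aqueous_ml
    ethanol_ml = 2.0 * aqueous_ml
    return {
        "sodium_acetate_ml": acetate_ml,
        "ethanol_ml": ethanol_ml,
        "total_ml": aqueous_ml + acetate_ml + ethanol_ml,
    }


peg = peg_precipitation(100.0)     # hypothetical 100 ml of bacteriophage suspension
etoh = ethanol_precipitation(6.0)  # hypothetical 6 ml of aqueous phase
print(peg)
print(etoh)
```

For the hypothetical volumes, 25 ml of NaCl/PEG solution gives final concentrations of 3% PEG and 0.5 M NaCl, and the ethanol precipitation comes to 18.6 ml in total, which fits in a 30-ml tube.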






Because growth of unmutated bacteriophages is suppressed with high
efficiency, there is often no need to screen plaques by hybridization.
Instead, single-stranded DNA can be prepared from a number of
well-isolated plaques and analyzed directly by DNA sequencing.






Using Mutagenesis to Study Proteins 



Until a few years ago, only two methods were available to study directly the 
relationship between the structure and function of a protein: (1) chemical 
modification of the side chains of the amino acids that form the primary 
sequence of the protein and (2) X-ray diffraction of protein crystals. Although 
both of these methods have yielded much information, they can be used only 
with proteins that are available in large quantity and are of high purity. A
less direct type of analysis may be used if naturally occurring mutations can 
be identified that map in the gene coding for the protein of interest and that 
generate an observable phenotype. The mutant genes can then be cloned, 
and their DNA sequences can be compared to that of the wild-type allele.
This approach has been invaluable in identifying mutations that affect the 
function of proteins such as the human LDL receptor (Davis et al. 1986) and 
factor VIII (Gitschier et al. 1988). However, this method also has its own set 
of restrictions: The mutations are sometimes confined to specific domains of 
the protein, and it is often difficult to isolate a sufficient number of indepen- 
dent mutants to allow definitive conclusions to be drawn about the structure 
of the protein. 

These constraints can be partially circumvented by using site-directed 
mutagenesis to introduce mutations at predetermined sites in a cloned cDNA 
and then expressing the altered gene in an appropriate host-cell/vector
system. By comparing the properties of the mutant and wild-type forms of 
the protein, it is often possible to identify domains or individual amino acid 
residues that are essential for the structural integrity and/or biological 
function of the protein. Because of the rapid advances that have occurred in 
recent years in both site-directed mutagenesis and expression of cloned 
genes, this method, which is sometimes called "reversed genetics," is often 
the first approach used by molecular biologists to analyze the relationship 
between a protein's structure and its function. The major problem with 
reversed genetics, however, is how to distinguish mutations that affect local 
structures from those that have profoundly deleterious effects on the folding 
or stability of the entire protein. Consider a typical experiment in which a 
number of point mutations have been generated at various sites in a gene 
coding for an enzyme. When the activities of these mutants are assayed, 
some of them show a reduction in catalytic function and others do not. In the 
absence of any other data, it is not possible to draw firm conclusions about 
the structure of the enzyme from this result. There is no way to know 
whether the substitution of one amino acid for another has affected only the 
structure and function of the active site or whether it has had more global 
effects. The problem would remain even if the three-dimensional shape of the 
wild-type enzyme were known. No algorithms have yet been devised that
accurately predict the perturbations in protein structure caused by substitu- 
tion, addition, or deletion of amino acid residues. 

These difficulties can be alleviated by developing independent assays for 
the folding of the protein of interest. Such assays typically include: 






2. Search the data banks for proteins with homologous amino acid sequences.
Residues that are highly conserved between proteins of different function
are more likely to be involved in forming common structural motifs than in
highly specific and private activities such as catalysis or ligand binding.

3. Use a computer program to analyze whether the amino acid sequence 
contains internal repetitions. These indicate that the gene coding for the 
protein has evolved by one or more rounds of duplication followed by 
mutagenic drift. Residues that are conserved between different repetitive 
elements are more likely to be involved in forming common structural 
motifs than in catalysis or ligand binding. 

4. When planning to make deletion mutants, avoid the temptation to remove 
the DNA sequences that lie between naturally occurring restriction sites. 
Although such deletions are easy to make, their borders are unlikely to lie 
at sensible positions within the coding sequence and the resulting proteins 
are almost certain to be malfolded. 

5. If the positions of introns and exons have been established by analysis of 
the corresponding segment of genomic DNA, consider the possibility of 
creating a set of deletion mutants that lack particular exons or combina- 
tions of exons. This approach has been particularly useful in analyzing 
the structure and function of proteins that have evolved by exon shuffling 
(see, e.g., Gething et al. 1988). Because individual exons often encode 
independently folding polypeptide domains, there is a good chance that 
precise removal of a particular exon will not prejudice the folding of the 
remainder of the protein. 

6. When constructing chimeras between nonhomologous proteins, try to 
arrange for the junctions between the two coding regions to lie at the 
borders of predicted structural domains rather than within them. For 
example, the ectodomains, transmembrane regions, or cytoplasmic tails of
different transmembrane receptors should be exchanged at residues that 
are predicted to lie close to the appropriate transmembrane region. 
However, if the two proteins are homologous in sequence and function, the 
junctions should be located in segments of amino acid sequence that are 
identical or nearly so. 

7. Avoid changing proline residues (all proteins) or cysteine residues (secre- 
tory or cell-surface proteins). These amino acids are usually involved in 
forming and maintaining essential structural motifs. Proline is the 
residue that is frequently used to terminate α-helical regions, whereas
pairs of cysteine residues form the stabilizing disulfide bonds that are a 
hallmark of many secretory and cell-surface proteins. 

8. When designing point mutations in secretory or transmembrane proteins, 
avoid altering asparagine or serine/threonine residues that lie within
potential sites for N-linked glycosylation (Asn-X-Ser/Thr, where X is any
amino acid other than proline). Mannose-rich oligosaccharides that are 
added posttranslationally to the asparagine residues in such consensus 
sequences are in some cases involved in assisting the nascent polypeptide 
to fold (Gallagher et al. 1988). 
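
Guidelines 7 and 8 lend themselves to a simple scan of the amino acid sequence before mutants are designed. The sketch below is one possible way to flag residues to leave unchanged; the 16-residue test sequence and the function names are hypothetical, and sequence positions are reported 1-based.

```python
import re

# Asn-X-Ser/Thr, where X is any residue except proline. The lookahead lets
# overlapping sites (e.g., N-N-T-S) be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


def positions_to_avoid(protein, secretory=True):
    """1-based positions of prolines, sequon Asn and Ser/Thr residues, and
    (for secretory or cell-surface proteins) cysteines."""
    seq = protein.upper()
    avoid = {i + 1 for i, aa in enumerate(seq) if aa == "P"}
    if secretory:
        avoid |= {i + 1 for i, aa in enumerate(seq) if aa == "C"}
        for asn_pos, _ in find_sequons(seq):
            avoid |= {asn_pos, asn_pos + 2}
    return sorted(avoid)


example = "MKNATCPLNPSWNGTC"  # hypothetical sequence
print(find_sequons(example))        # Asn-Pro-Ser at position 9 is not reported
print(positions_to_avoid(example))
```

The scan only identifies candidate positions from the consensus; whether a given site is actually glycosylated must still be established experimentally.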






INSERTION OF HEXAMERIC LINKERS INTO PROTEIN-CODING
SEQUENCES

The method given below, which is an elaboration of the work of Boeke (1981), 
is based on protocols published by Barany (1985a,b, 1987, 1988). Synthetic 
hexameric linkers are inserted quasi-randomly into the gene by targeting 
insertion to sites for frequently cutting restriction enzymes. The sequence of 
the hexameric linker is chosen so that no termination codons are inserted. A 
selectable marker (Vieira and Messing 1982) is used to identify mutants that
have incorporated the linker. Although straightforward, the approach is 
limited to insertions at preexisting restriction enzyme sites. Nevertheless, 
this approach can be very useful in the initial determination of functional 
domains of a protein. The method involves the following steps:

• Cloning the target into a suitable plasmid (e.g., pUC18/pUC19 or pUC118/
pUC119)

• Linearization of the plasmid DNA by partial digestion with a restriction 
enzyme that cleaves frequently within the target sequence and generates 
cohesive termini 

• Ligation of a single-stranded hexameric linker to the cohesive termini 

• Removal of excess linkers by cleavage with the appropriate restriction 
enzyme 

• Ligation of a fragment of DNA carrying a selectable marker (usually kan) 
to the cohesive termini of the linkers 

• Selection of transformed bacteria that are resistant to kanamycin and 
ampicillin 

• Removal of the fragment of DNA carrying the selectable marker by cleav- 
age with the appropriate restriction enzyme 

• Recircularization of the linear DNA molecule lacking the kan gene 

• Transformation of bacteria with the recircularized plasmid 

• Screening of transformed colonies for those that carry plasmids containing 
a novel restriction site within the target sequence 

This method is therefore a variation of the technique described on pages 
15.32-15.50 that is used to construct linker-scanning mutations. A pre- 
assembled kit can be purchased from Pharmacia (TAB™ mutagenesis sys- 
tem) that contains the materials required for this type of mutagenesis. 
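
Because a hexameric linker adds exactly two codons, the reading frame downstream from the insertion is preserved, but the linker sequence and its junctions should still be checked for termination codons. The following sketch inserts a linker into a coding sequence and lists any in-frame stop codons. The short open reading frame, the second linker (TGATCG), and the function names are hypothetical; AATTCG inserted at an HhaI site (GCG^C) is used as the example linker.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_linker(coding_seq, cut_index, linker):
    """Insert a hexameric linker after `cut_index` bases of the coding sequence."""
    if len(linker) != 6:
        raise ValueError("linker must be six nucleotides long")
    return coding_seq[:cut_index] + linker + coding_seq[cut_index:]


def inframe_stops(coding_seq):
    """Return 0-based codon indices of stop codons in the first reading frame."""
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in STOP_CODONS]


orf = "ATGGCGCTGAAA"             # hypothetical fragment: Met-Ala-Leu-Lys
cut = orf.index("GCGC") + 3      # HhaI cleaves GCG^C on the upper strand

good = insert_linker(orf, cut, "AATTCG")
bad = insert_linker(orf, cut, "TGATCG")   # hypothetical linker carrying TGA

print(good, "GAATTC" in good, inframe_stops(good))
print(bad, inframe_stops(bad))
```

In this example the AATTCG insertion creates an EcoRI site (GAATTC) without introducing a stop codon, whereas the second linker places TGA in frame. Since a given restriction site may fall in any of the three codon phases, the same check would need to be repeated for each site at which insertion is possible.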






5'...GCGC...3'
3'...CGCG...5'

        HhaI

5'...GCG    C...3'
3'...C    GCG...5'

      5' AATTCG 3'

5'...GCGAATTCGC...3'
3'...CGCTTAAGCG...5'

When the appropriate hexameric linker has been synthesized, proceed as 
described on pages 15.88-15.94. 






bromide has been added, use 1 x restriction enzyme buffer to adjust
the volume of the solution in each tube to 49 µl. The concentrations of
ethidium bromide in the three digestion mixtures are therefore 20
µg/ml, 40 µg/ml, and 60 µg/ml, respectively.

Caution: Ethidium bromide is a powerful mutagen and is moderately 
toxic. Follow precautions detailed in step 1h, page 15.9.

b. Add 10 units of the appropriate restriction enzyme to each tube, and 
immediately transfer the three tubes to a water bath set at 37°C. 
After 0, 5, 10, 20, 30, 45, and 60 minutes of incubation, transfer 
aliquots (5 µl) from each of the three tubes to fresh microfuge tubes
containing 15 µl of 5 mM EDTA (pH 8.0). Heat each sample for 10
minutes at 68°C. Store the tubes containing the aliquots on ice until 
all of the samples have been collected. 

c. Analyze the DNA in each sample by electrophoresis through a 0.8% 
agarose gel cast and run in 0.5 x TBE (see Appendix B) containing 
ethidium bromide (0.5 µg/ml). As markers, use plasmid DNA that
has been linearized by digestion with a restriction enzyme that 
cleaves at only one site. Under these conditions of electrophoresis, 
closed circular DNA migrates slightly faster than linear DNA and 
considerably faster than relaxed circular DNA.

d. Examine the gel by ultraviolet illumination, and determine the condi- 
tions that give the maximum yield of full-length linear molecules. 
Usually not more than 33% of the original closed circular plasmid 
DNA is converted to full-length linear molecules. 

Caution: Ultraviolet radiation is dangerous, particularly to the eyes. 
Follow precautions detailed in step 1i, page 15.9.

e. Set up a large-scale digestion containing 20 µg of closed circular DNA.
Increase the volume of all of the components of the reaction
proportionally, and incubate the reaction for the time required to achieve
maximal conversion of closed circular DNA to full-length linear
molecules.

5. Purify the full-length linear DNA by preparative agarose gel
electrophoresis using one of the methods described in Chapter 6. Redissolve
the DNA in TE (pH 7.6) at a concentration of 250 µg/ml. Approximately
1-2 µg of full-length linear DNA will be required to complete the
remainder of the procedure.

6. a. Ligation of single-stranded hexameric linkers to linear plasmid DNA
carrying protruding 5' termini and removal of excess linkers

i. Phosphorylate approximately 10 µg of hexameric linkers with
bacteriophage T4 polynucleotide kinase as described on page 15.63. The
total volume of the reaction should not exceed 10 µl.

ii. Ligate the phosphorylated linkers to the linear plasmid DNA as 
follows: 






iii. Add:

10 x bacteriophage T4 polynucleotide kinase buffer    2 µl
10 mM ATP                                             2 µl
H2O                                                   6 µl
bacteriophage T4 polynucleotide kinase                10 units

Incubate the reaction at 37°C for 30 minutes, and then heat the 
mixture to 68°C for 15 minutes to inactivate the bacteriophage T4 
polynucleotide kinase. 

10 x Bacteriophage T4 polynucleotide kinase buffer

0.5 M Tris·Cl (pH 7.6)
0.1 M MgCl2
50 mM dithiothreitol
1 mM spermidine HCl
1 mM EDTA (pH 8.0)

iv. Purify the DNA by extraction with phenol:chloroform, and recover
the DNA from the aqueous phase by precipitation with 0.5 volume of
ammonium acetate and 2.5 volumes of ethanol. Store the tube for 10
minutes on ice, and recover the DNA by centrifugation at 12,000g
for 15 minutes at 4°C in a microfuge. Remove the supernatant and
wash the pellet of DNA carefully with 70% ethanol. Allow the pellet to
dry at room temperature, and then redissolve it in 20 µl of TE (pH
7.6).

v. Proceed to step 7. 

7. Add:

H2O                                            18 µl
10 x ligase buffer                             5 µl
purified fragment of DNA that
  carries the kan^r gene (step 3) (0.5 µg)     2 µl

Add 5 Weiss units of bacteriophage T4 DNA ligase and 10 mM ATP to a
final concentration of 1 mM. Incubate the reaction for 6-8 hours at 16°C.

In a separate tube, set up a control containing all of the components of
the ligation mixture except the linearized plasmid DNA to which
hexameric linkers have been added, and incubate as above.

8. Use 10-µl aliquots of the control and test ligations to transform
competent E. coli of an appropriate strain (e.g., DH1 or DH5) to resistance to
kanamycin and ampicillin. Select transformants on LB agar plates
containing kanamycin and ampicillin (or carbenicillin), each at a
concentration of 100 µg/ml. If the experiment has gone well, the test ligation
should generate approximately tenfold more colonies than the control
ligation.






ampicillin and kanamycin, each at a concentration of 100 µg/ml. Use a
bent glass rod and a pasteur pipette to scrape and squirt the colonies
from the surfaces of the plates. Pool the bacterial suspension obtained
from each of the plates, and recover the bacterial cells by centrifugation
at 5000g for 10 minutes at 4°C. Remove the supernatant medium.

16. Isolate the closed circular plasmid DNA from the pooled colonies by one 
of the small-scale methods described in Chapter 1. 

17. To remove the fragment carrying the kan^r gene from the plasmids
containing the synthetic linker, digest the preparation of pooled plasmid 
DNAs with a restriction enzyme that recognizes the site created by 
ligation of the hexameric linkers. 

18. When the restriction digest is complete, dilute the reaction mixture with 
1 x ligase buffer until the concentration of the plasmid DNA is < 3
µg/ml. Add approximately 5 Weiss units of bacteriophage T4 DNA ligase
per 100 µl of reaction mixture. Incubate the reaction for 4 hours at 16°C.

1 x Ligase buffer

20 mM Tris·Cl (pH 7.6)
5 mM MgCl2
5 mM dithiothreitol
1 mM ATP



19. Concentrate the DNA by extracting the ligation mixture twice with 
1-butanol (see Appendix E). Add 0.2 volume of 10 M ammonium acetate,
and precipitate the DNA with 2.5 volumes of ice-cold ethanol. Store the 
tube in an ice bath for 30 minutes. 

20. Collect the DNA by centrifugation at 25,000g for 30 minutes at 0°C.
Carefully decant the ethanol, and wash the pellet of DNA with several 
milliliters of 70% ethanol at room temperature. Again, recover the DNA 
by centrifugation, and remove the 70% ethanol by careful aspiration. 
Stand the open tube on the bench until the last traces of ethanol have 
evaporated. Dissolve the DNA in 20 µl of TE (pH 7.6), and estimate the
recovery of DNA by analyzing a small aliquot by electrophoresis through 
an agarose gel. 

21. Use approximately 50 ng of the DNA to transform competent E. coli 
(strain DH1 or DH5) to resistance to ampicillin only. Plate the trans- 
formants on LB agar medium containing ampicillin (100 µg/ml), and
incubate them for 18-24 hours at 37°C. Store the remainders of the 
ligation mixtures at -20°C. 

22. Pick 36 independent transformants, and establish small-scale (2-ml) 
cultures in LB medium containing ampicillin (100 µg/ml). After the










